Exploring the growth trait molecular markers in two sheep breeds based on Genome-wide association analysis

Growth traits are quantitative traits controlled by multiple micro-effect genes. we identified molecular markers related to sheep growth traits, which formed the basis of molecular breeding. In this study, we randomly selected 100 Qira Black sheep and 84 German Merino sheep for the blood collection the jugular vein to genotype by using the Illumina Ovine SNP 50K Bead Chip. quality control criteria for statistical analysis were: rejection detection rate < 90% and minimum allele frequency (MAF) < 5%. Then, we performed Genome-wide association studies (GWAS) on sheep body weight, body height, body length, and chest circumference using mixed linear models. After getting 55 SNPs with significant correlation, they were annotated by reference genome of Ovis aries genome (Oar_v4.0) and We obtained a total of 84 candidate genes associated with production traits (BMPR1B, HSD17B3, TMEM63C, etc.). We selected BMPR1B for population validation and found a correlation between the FecB locus and body weight traits. Therefore, this study not only supplements the existing knowledge of molecular markers of sheep growth traits, but also has important theoretical significance and reference value for the mining of functional genes of sheep growth traits.


Introduction
Growth traits have economical importance in sheep production. To increase the income of herdsmen and speed up sheep breeding, molecular breeding technology has been used in sheep breeding. With the publication of the HapMap Project, Genome Sequencing Project, and the emergence of commercial high-density SNP chips (cattle 54 K, pigs 60 K, chickens 60 K, horses 60 K, and sheep 50 K) and genotyping techniques, GWAS has become an important method to detect the candidate genes for complex quantitative traits in livestock and poultry.
GWAS studies on different traits were carried out in sheep, including the angular type [1], polyangular traits [2], coat color [3], and polyparous traits [4]. Zhang et al. [5] used the ovine 50 K SNP chip to conduct GWAS on 329 pure sheep (Sunite sheep, German Merino sheep, and Dolper sheep). Candidate genes for growth and meat traits (birth weight, weaning weight, weight at 6 months of age, eye muscle area, backfat thickness, daily gain before weaning, daily assessed by determining the A260/280 in the biophotometer (type: DS-11,DeNovix, USA). The samples were accepted only when the A260/280 values were between 1.8 and 2.0. Then, the DNA was quantified and samples were diluted to a minimum and a maximum concentration of 50 ng/μL and 150 ng/μL, respectively, for the Illumina Ovines SNP 50 chip. The BeadScan software converts the images and the Genome Studio software processes the data. The Plink 1.07 [12] software was used for data quality control. The quality control criteria were: the individual samples with a detection rate of < 95% were removed, the SNPs detection rate was < 90%, the minimum allele frequency was < 5%, and the p-value of Hardy-Weinberg equilibrium test was < 1×10 −6 .

Genetic relationship matrix and principal component analysis
GCTA software [13] was used for principal components analysis (PCA) analysis of filtered SNPs data, which was based on the genetic correlation identification between individuals and it also represented the main components of population structure. The genetic matrix was calculated by the KING (http://people.virginia.edu/~wc9c/KING/manual.html) software and mapped by R language.

Statistical methods and models
GWAS analysis was performed using the TASSEL 5.0 software (https://www.maizegenetics.net), and was based on a mixed linear model (MLM). GWAS analysis of four body weight traits in 184 sheep was performed using genotypes obtained from Illumina OvineSNP50 microarray, with full consideration of population stratification and individual relationship. The statistical model is as follows: where y is the vector f the observed population phenotype; μ is the mean vector; g is the fixed environmental effect vector; b is the SNP effect vector; u is the multigene effect vector; x, w, and m are the association matrices of g, b, and u, respectively; and e is the residual.
It was assumed that both u and e obeyed normal distribution. e is the random vector of Ṕ 1, the residual in a general sense, e~N (0, R), R = G s 2 , s 2 is the residual variance, and G is the residual correlation matrix. u is the random vector of M´1, called the random effect of u, uÑ (0, Y), Y = s 2 K, s 2 is the additive genetic variance of Genome-wide association analysis or multi-locus analysis; K is the correlation matrix of kinship. Threshold is determined by the Bonferroni correction method. SNPs are considered to be strictly Genome-wide significant at P < 0.05/ N and potentially significant at P < 0.01/ N. N is the number of bits after quality control of the genome chip.

Candidate gene annotation
Statistically significant SNPs annotations referencing the genome of Ovis aries genome (Oar_v4.0) were obtained. The functional annotation of candidate genes was performed referencing the NCBI databases (http://www.ncbi.nlm.nih.gov/gene) and OMIM database (http://www.ncbi.nlm.nih.gov/omim). design primers for the obtained full-length sequence of the BMPR1B. The primer information is shown in Table 1. All primers were synthesized by the Shanghai (China) Sheng gong Bioengineering Company.
The 25 μL PCR reaction system contained: 2× Taq Master Mix-12.5 μL, upstream and downstream primers-1 μL (0.5 μmol/L), DNA template 1 μL (50 ng/μL), ddH 2 O -9.5 μL. The PCR amplification reaction conditions were: pre-denaturation at 94˚C for 5 min; denaturation at 94˚C for 45 s, annealing temperature for 30 s, extension at 72˚C for 45 s, a total of 35 cycles; the final extension at 72˚C for 5 min; followed by storage at 4˚C. PCR products were detected by electrophoresis and sent to the Shanghai Shenggong Company for sequencing.

Statistical analysis of data
Sequencing results were compared on DNASTAR [14] to search for polymorphic loci. Using the SPSS 26.0 [15] software, a one-way ANOVA combined with Duncan's model was used to analyze the effects of different loci and genotypes on growth traits of Qira Black sheep ewes. The results were expressed as mean ± standard deviation and p < 0.05 was used as the criterion for significance of difference.

Quality control and data analysis
After quality control, we obtained 46,871 SNPs. We conducted a descriptive analysis on the observed values, thus providing a preliminary analysis of their inherent information to describe the overall characteristics of groups. Descriptive statistical analysis is the basic requirement and premise of the follow-up statistical analysis. The growth traits of the resource population in this study included body weight, body height, body length, and chest circumference. Table 2 shows the analysis indexes of growth traits of population resource sheep.  Fig 2 shows the genetic structure distribution in the two sheep breeds, which causes the difference in breeds structure post selection. Therefore, population stratification should be considered and corrected in subsequent Genome-wide association analysis.

Genome-wide association analysis and gene annotation results
GWAS analysis results showed that 55 SNPs were significantly correlated with the four growth traits (body weight, body height, body length and chest circumference) at the chromosomal level. Using genome of Ovis aries genome (Oar_v4.0) and public database information like the NCBI, we compared and annotated the SNPs loci with significant associations with productive traits obtained in the GWAS analysis results for different growth traits (S1-S4 Tables).
The results were analyzed by GWAS, and the original data comprised three body size traits and one body weight trait. Fig 3 shows the Manhattan diagram of four growth traits, with 46,871 SNPs being distributed on 27 chromosomes for association analysis. The significance level threshold after correction was 1.0×10 −6 .
From the Table 3, we can see that significant SNPs were found through GWAS analysis and corresponding basic information. We screened the candidate genes for related traits in body length, body height, body weight, and chest circumference. According to the P-value of 1.0×10 −6 locus, we found 55 SNPs related to these traits along with 84 nearby genes (Tab 3). The names, physical locations, and neighboring genes of SNPs significantly related to growth traits are shown in Table 3.

Amplification results of PCR products
PCR amplification results showed that the amplified region of BMPRIB gene was 192 bp (Fig 5), which was consistent with the expected target band size, and it also lacked heterozygous bands, which meets the requirements of subsequent tests.

Sequencing verification and mutation site identification
Sequencing results showed that BMPR1B gene had a mutation site (Fig 6): g.431965A > G; at the 431965A > G mutation, resulting in three genotypes: AA, AG, and GG.

Association analysis of BMPRIB Gene polymorphism and growth traits of Qira Black sheep
As can be seen from Table 4, the BMPR1B gene g.431965A>G locus in the Qira Black sheep, The weight of the AA genotype at the G locus was significantly higher than those of the AG and GG genes (p < 0.05). The weight of the AG genotype did not differ significantly from that of the GG genotype (P > 0.05).

Discussion
The false positive phenomenon of GWAS analysis was mainly due to population stratification, which eventually gave rise to multiple SNPs loci associated with the concerned traits in the GWAS analysis results [16]. To reduce these false-positive results, population stratification and relatedness among individuals were fully considered. We used PCA and breed effect to solve the population stratification phenomenon. As can be seen from the QQ-plot, there was no population stratification phenomenon in the corrected population, and the GWAS analysis results based on the MLM were relatively reliable. Additionally, the TASSEL 5.0 software provides a kinship matrix, two correction a group the genetic relationship between the two individuals, to improve the effectiveness of the GWAS analysis. In this study, we performed a GWAS on the Qira Black sheep and German Merino sheep via genotyping data by using a medium-density chip containing 46,871 SNPs. We detected 55 SNPs that were significantly associated with growth traits. From the genomic screening information, we found over 10% of the genes to be located in the sheep chromosomes 1, 2, 6, and 12. Among them, 10 loci were located on chromosome 1 (OAR1_223367286.1, S06270.1, S33133.1, etc.). There were 6 SNPs loci on chromosome 2, 6, and 12 (OAR2_179242768.1, OAR6_111740006.1, OAR12_60622021.1, etc.) We obtained a total of 84 genes through gene annotation. Based on gene enrichment analysis, we can preliminarily infer that these loci were important molecular markers that affected the growth and meat production traits of sheep. Zhang et al. [5] had obtained 36 significant SNPs for seven traits, while we obtained 55 significant SNPs for 4 traits (body weight, body height, body length, and chest circumference), which enriched the SNPs related to sheep growth and development. Furthermore, Jiang et al. Merino sheep. However, we identified 9 and 13 SNPs correlated with body height and chest circumference in this study, respectively. Since we had selected different germplasm resources for this study, we obtained the results at different loci. Although some of the annotated genes have already been previously reported, we found genes related to growth and development traits, including POMK, MEIS1, CROT, BMPR1B, FABP5, SPATA17, etc. Based on the NCBI and Gene Cards database annotations, we found that among the meat quality traits, genes related to growth traits mainly included POMK, BMPR1B, FABP5, RPL17, EFNA5, NELL1 and PPARGC1A.
The protein O-mannose kinase (POMK), located on chromosome 26 of sheep (37073963~37090331 bp), encodes a vital glycosylated and functional dystrophic glycan complex, which is critical to the extracellular matrix composition. Zhu et al. [19] reported that the POMK lesions can result in an abnormal functioning α-dystroglycan, which affects the development of the muscle and brain, ultimately resulting in congenital muscular dystrophy. The underlying mechanism of this phenomenon is still unclear. Since the seequence alignment revealed that it lacks the highly conserved catalytic residues of typical kinases, it has long been considered as a "pseudokinase" that lacks kinase activity.
Carnitine O octyltransferase (CROT) is located on the sheep chromosome 4 (34416632~34467060 bp). It is a peroxidase that converts short-chain fatty acids into carnitine in the mitochondrial matrix. In combination with carnitine palmityl transferase (CPT1A), fatty acid oxidation is performed. The CROT gene regulates fatty acid metabolism by controlling the enzyme carnitine O-octyltransferase. The CROT gene in poultry, especially chickens, has been rarely reported. MiR-33 was found to target the silencing of fatty acid β oxidation-related genes CROT and HADHB [20,21].
Bone Morphogenetic protein receptor type 1B (BMPR1B) is located on the sheep chromosome 6 (30030664~30482585 bp). BMPs promote cartilage and bone tissue formation by binding to the bone morphogenetic protein receptor (BMPR) [22]. It also regulates biological functions, like cell proliferation, differentiation, migration and apoptosis, cell support, and embryonic development [23,24]. BMPR1B-induced follicular growth and development directly affects the egg production and is regulated by both autocrine and paracrine factors. The transforming growth factor β (TGF-β) family regulates follicular development through autocrine  action on the granulosa cells, oocytes, and follicular membrane cells [25][26][27]. Members of the family include bone morphogenetic proteins (BMPs), growth differentiation factors (GDFs), activins, and inhibitant. BMPR1B is a major gene affecting multiple lambs, which encodes the serine/threonine kinase activity containing receptor of multiple BMPs, which is vital in follicular development, growth, and immunity, and is closely related to animal reproduction [28]. In recent years, multiple studies have shown that the polymorphism of this gene is significantly associated with the early growth rate, body height, and body length of livestock [29]. Fatty Acid Binding Protein 5 (FABP5,e-FABP), located on the sheep chromosome 9 (57575800~57580855 bp), is one of the important members of the fatty acid protein gene family. Knockdown of the FABP5 gene leads to the cellular triglyceride accumulation, decreased cholesterol level, and decreased secretion of the ApoB100 protein and lipoprotein-like particles [30]. In studies of FABP5 gene knockout mice, deletion of this gene caused sebaceous gland atrophy, sebum volume increase, and composition change [31]. Fang et al. [32] found in oral cancer cells that the overexpression of FABP5 up-regulated the MMP-9 gene expression, which increased cell proliferation and invasion. Estell et al. [33] found that this gene is closely related to the pig fat deposition, which was consistent with the results of Ojeda [34]. Therefore, FABP5 gene is not only closely related to diverse cancer types, but also has been widely reported in the field of livestock and poultry molecular breeding research. However, no study has reported on the growth traits of sheep.
Myeloid ectropic viral integration site 1 (Meis1) is located on the sheep chromosome 3 (41789220~41936042 bp). Meis1 is an important regulator of cardiac differentiation and is involved in normal cardiac development. It also helps regulate the homeostasis of adult cardiac myocytes, and its mutation can also cause cardiac conduction defects [35]. In the process of cardiac differentiation, the expression patterns of Meis1 and heart-specific gene Nkx2.5 overlap, with the regulatory mechanisms of the two common targeted genes being consistent both temporally and spatially [36]. Meis1 gene is not only involved in cell proliferation and differentiation, but also actively involved in blood vessel development and hematopoiesis, and its malfunctioning closely related to leukemia. Meis1 is expressed in different solid tumors and is involved in the proliferation and differentiation of tumor cells. In angiogenesis and development, the Meis1 gene is vital in the vascular endothelial growth factor (VEGF) pathway and hematopoietic stem cell (HSC) mediated angiogenesis.
Ribosomal protein L17 (RPL17), is located on the sheep chromosome 4 (64829711~64830378 bp). It is a part of the large subunit, belonging to the ribosomal protein L22 family, which is the core member of the large subfamily of ribosomal protein. It is the only ribosomal protein that can interact with all six domains of the 23S rRNA, which is crucial for guiding the correct folding and conformation stability of 23S rRNA. It is also one of the six translocation binding sites on the surface of the ribosome. In addition, Paola et al. [37] found that RPL17 was differentially expressed in the cleavage process of Xenopus laevis, and speculated that it might play a regulatory role in the cleavage. Zhong et al. [38] found that the RPL17 gene expression level was different at different stages of the Japanese flounder embryogenesis, and it was the highest at the tail bud stage. The ribosomal protein gene was positively expressed during cell proliferation. In this study, RPL17 was expressed in all tissues of S. japonicus, indicating that RPL17 helps regulate the growth and development of tissue. Therefore, RPL17 gene is highly expressed in the coelomic cells, and is important in regulating their development, cell division, proliferation, and differentiation.
Ephrin A5 (EFNA5), a member of the receptor tyrosine kinases (RTKs) family, is located on the cell membrane. It is located on the sheep chromosome 5 (103981768~104274426 bp). Studies on the EFNA5 gene in other fields are continuously increasing, with diverse new physiological functions already been reported. The mammalian EFNA5 gene regulates proliferation, apoptosis, differentiation, adhesion, and migration in cells [39]. It is involved in lens development and maintenance [40], osteosarcoma [41], and plays a role in follicular formation in female mammals [42]. Recent studies have found that the EFNA5 gene can participate in fat thermogenesis and also regulate lipid metabolism [42], which may be a future therapeutic target for obesity and is also related to the meat quality traits of economic animals [43]. An et al. [43] conducted Genome-wide analysis of the six growth traits, including height, body length, hip height, heart size, abdominal size, and tube bone size in the Simmental cattle at three growth stages (6, 12 and 18 months after birth). They found multiple candidate genes, with EFNA5 being one of them. Additionally, the EFNA5 gene was also identified as a candidate gene affecting the growth traits of broilers [44]. These studies suggest that EFNA5 may help regulate the meat quality traits of animals by promoting adipocyte differentiation. EFNA5 also plays an important role in the individual development and physiological function [45].
Nel-like type 1 (NELL1) is located on sheep chromosome 21 (20804714~21839088 bp), and is a novel growth factor associated with craniosynostosis with high specificity for bone and chondrocyte lines [46]. The NELL1 gene was first isolated from the human fetal brain cDNA library, and it is important for promoting osteogenic differentiation and bone regeneration during the process of bone tissue development [47]. Recent studies have shown that [48][49][50][51] NELL1 can promote the growth of osteogenic tissue, while osteogenesis and lipid formation are two opposite processes. Overexpression of NELL1 promotes osteoblast differentiation and mineralization, while down regulation of NELL1 inhibits osteoblast differentiation [52]. Multiple animal model studies have confirmed that NELL1 can also promote bone and cartilage regeneration [53][54][55].
Peroxisome proliferators-activated receptors-γ coactivator lA (PPARGC1A), located on the sheep chromosome 6 (position 43944708~44662482 bp), is an important gene, which was first discovered and reported in the screening of the mouse brown fat cDNA library [56]. As a key nuclear transcription co-activator, PPARGC1A can bind to multiple transcription factors and participate in a series of orderly metabolic processes, thus being important in regulating mitochondrial biosynthesis, glucose metabolism, fatty acid oxidation, muscle fiber type transformation, etc [57][58][59]. PPARGC1A co-localized with the quantitative trait loci (QTL) associated with fatty acid synthesis and carcass traits [60,61]. Studies have shown that the PPARGC1A gene has SNPs that are significantly associated with the backfat thickness and meat color of suet [62][63][64]. The polymorphism loci of the PPARGC1A gene that were significantly associated with the growth traits (body weight, daily gain, etc.), had previously been detected in cattle, sheep, and chickens [65][66][67]. However, the correlation between the PPARGC1A gene and daily gain and feed conversion rate in pigs was rarely reported.

Conclusion
In this study, we conducted an association analysis based on the SNPs data and growth traits of the Qira Black sheep and German Merino sheep. We obtained 55 significant SNPs loci, and there were a total of 84 genes near these SNPs. Mutant loci existed in the BMPR1B gene, and association analysis of growth traits with SNPs loci by SPSS 26.0 software revealed three genotypes at the g.431965A>G locus of the BMPR1B gene. Population validation of BMPR1B confirmed our study results. Therefore, it has both theoretical and practical significance for the molecular breeding of sheep growth performance.
Supporting information S1